APPLIED RESEARCH IN A CURRICULUM PROJECT:

Accomplishments and Limitations 

Wayne W. Welch 
Harvard Project Physics 

There is general agreement that an important part of national
curriculum projects is a comprehensive program of evaluation. Although
many projects give lip service to this goal, there is little evidence
to support their claims. Of 68 projects reviewed in the 1968 Report
on Science and Mathematics Curricular Developments,¹ only 19 indicate
the availability of research evidence demonstrating achievement of
their objectives. But since only 6 of these 19 bothered to use randomly
selected control groups, one seriously questions the generalizability
of the results that are available.

Evaluation as conducted by national projects has, for the most part,
consisted of large quantities of feedback and some achievement
testing. The feedback is usually unstructured and typically includes
teacher reactions, classroom visits by staff, anecdotal reports, and
professional opinion. All 47 of the 68 projects that reported conducting
any kind of evaluation included one of these activities. While feedback
information may give project personnel some feeling of whether
or not they are achieving their hopes and ambitions, it is fraught
with dangers of sentiment and subjectivity. This unstructured feedback,
so often the only evaluation conducted by a curricular development
group, reminds me of playing a basketball game, not keeping score, and
then circulating among the spectators asking opinions on who won
the game.

Paper presented at the 1969 annual meeting of the American Educational
Research Association, Los Angeles, California, February 6, 1969.

On second thought, because control groups are seldom used, a
better analogy might be only one team playing, and then asking
spectators what they thought about the team's chances for success. The
point to be made out of these analogies is that evaluation opinion
needs to be supported by some hard-nosed data. Obtaining these data
is the purpose of applied research in curriculum projects.

"Applied research" includes those activities that are designed
to gather information useful in making decisions about a specific
curriculum or course. Applied research differs from basic research
in being oriented to a specific curriculum, rather than to variables
common to many curricula. Basic research might involve the question,
"What, if any, is the effect of teacher personality on the social
climate of learning?" This is a general question, appropriate to many
courses. Applied research, on the other hand, involves questions about
specific programs; for example, "What mathematical knowledge is
prerequisite to success in the CHEM Study course?" The latter question
is of interest primarily to those developing the course and to those
who are considering using it.

On many occasions, applied and basic research interact with
and complement each other. But for the purposes of this paper, they
will be treated as separate endeavors. It is useful to think of two
separate components of applied research. One is designed to answer
questions about improving a curriculum during its development
(sometimes called "formative evaluation"). The other is designed to answer
questions about the final product (usually called "summative evaluation").
Because we have spent a great deal of time and energy on both these
aspects of evaluation, it may be of some interest to outline some of
the advantages and pitfalls we have identified.

Formative evaluation is conducted for the purpose of course
improvement. Most projects do some kind of casual formative
evaluation if provisions for revision are written into their
development plans. In addition to the usual verbal and highly subjective
kinds of formative evaluation, it is possible to carry out more
structured and objective investigations. For example, studies of readability,
the length of materials, achievement test results, semantic
differential type questionnaires, inventories, student questionnaires, ratings of
teachability, and interaction analysis all can be used to some extent.
But each of these methods has its own strengths and limitations, and
the evaluator must be careful not to fall into the trap of believing
his program is complete just because it is based on one or several
of these methods. Information gathered from each kind of evaluation
study can be used to answer different questions.

To give some idea of the success and limitations of a particular
type of formative evaluation study, I have chosen one of the several
questions investigated by our evaluation group and also carried out
by several other projects; namely, "Do the results of the achievement
tests indicate areas where the course could be improved?" I want to
describe our procedure briefly and point out some of the problems and
successes we encountered.



In the second trial year of the Project, an achievement test was
written for each of the six units of the course. To provide
information on which to base revisions, the authors were given item
analyses of each of the tests. In this way, various parts of the text
that were not conveying their message to the students would be
identified, and authors could concentrate their efforts on improving these
areas.

The test writing process had several practical problems to
overcome; perhaps most important was the severe time restriction on
producing the tests. Only about six weeks were available to write, revise,
print, and have each test to the teacher by the time the unit was
finished.

The course was being tried by 16 teachers and 500 students from
all parts of the country, so test arrival and responses were subject
to the whims of the U.S. Post Office. However, we found that it was
possible to write a test, have it printed, mailed, and in the teacher's
hands within the six-week limitation.

Data were obtained on each of the tests (subject to the varying
promptness of teachers returning the answer sheets), and a complete
item analysis was performed. Means, standard deviations, item
discrimination indices, success levels, and test reliabilities were computed.
Item success levels were sent to revision authors together with the
percentage of students choosing each distractor. Thus, we had
established what I consider a typical example of formative evaluation, and
one that I thought would be found in several curriculum projects.
However, only 7 of 68 projects reviewed in the AAAS Report
indicated they had used test results in this manner.
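
The item statistics named above can be illustrated with a minimal sketch.
It is not the Project's program: the paper does not say which discrimination
index or reliability formula was used, so the sketch assumes the common
upper-lower 27% index and the Kuder-Richardson formula 20. The function
name, argument names, and sample answers are all hypothetical.

```python
from statistics import mean, pstdev, pvariance


def item_analysis(responses, key, tail=0.27):
    """Classical item analysis for a multiple-choice test (illustrative).

    responses: one list of chosen options per student.
    key: the keyed (correct) option for each item.
    tail: fraction of students in the upper and lower groups (assumed 27%).
    """
    n_students, n_items = len(responses), len(key)
    # Score every response 1 (correct) or 0 (incorrect).
    scored = [[int(r[i] == key[i]) for i in range(n_items)] for r in responses]
    totals = [sum(row) for row in scored]

    # Rank students by total score to form the upper and lower groups.
    order = sorted(range(n_students), key=lambda s: totals[s], reverse=True)
    g = max(1, round(tail * n_students))
    upper, lower = order[:g], order[-g:]

    items = []
    for i in range(n_items):
        # Success level: proportion of all students answering correctly.
        success = mean(row[i] for row in scored)
        # Discrimination: upper-group success minus lower-group success.
        discrimination = (mean(scored[s][i] for s in upper)
                          - mean(scored[s][i] for s in lower))
        # Percentage of students choosing each option, distractors included.
        chosen = [r[i] for r in responses]
        percent = {opt: 100 * chosen.count(opt) / n_students
                   for opt in sorted(set(chosen))}
        items.append({"success": success,
                      "discrimination": discrimination,
                      "percent_choosing": percent})

    # KR-20 reliability (assumed formula); undefined if all totals are equal.
    sum_pq = sum(it["success"] * (1 - it["success"]) for it in items)
    kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / pvariance(totals))

    return {"mean": mean(totals), "sd": pstdev(totals),
            "reliability": kr20, "items": items}


if __name__ == "__main__":
    # Hypothetical answer sheets for a four-item test, not Project data.
    sheets = [list("ABCD"), list("ABCA"), list("ABDD"),
              list("ACCB"), list("BBAC"), list("CDAB")]
    for number, item in enumerate(item_analysis(sheets, list("ABCD"))["items"], 1):
        print(number, item)
```

An item with a low success level and a heavily chosen distractor is the kind
of result that would send an author back to the corresponding section of text.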

This procedure enjoyed some limited success. Several items with
high difficulty levels indicated sections of the text that required
rewriting. A few items helped identify topics where existing
explanation seemed adequate. However, there were practical problems in
this sequence that future curriculum groups should be aware of.

Perhaps the greatest problem is the time it takes to receive
answer sheets from teachers scattered throughout the country, keypunch
the results, and run the cards through the item analysis program.
Regardless of our efforts to shorten it, approximately three or four
months were required. Unfortunately, a large portion of the text
revision would already be completed by the time the item analysis
results were available to the authors. Furthermore, several authors
were post hoc suspicious of the test items and their ability to probe
the understandings the authors believed to be in the text. If a set
of item statistics indicated a section was not being understood, these
authors were more likely to criticize the tests than to criticize
the text. This was sometimes the case even though they initially had
approved the items.

Another problem we encountered is the extreme difficulty an author
has in trying to make a concept or idea more understandable once it
has been determined that students have failed to grasp the idea. He
may know that students do not understand Newton's Third Law of Motion,
but may not be able to revise the text to make it simpler. It is
entirely possible the revision will make the topic even more difficult.

These comments on the value of this type of formative evaluation
do not imply failure, but rather the existence of limitations. I
believe that it does help improve a course and has a definite place
in an evaluation program. However, we must recognize that its
usefulness is limited by a number of practical factors, and that these
factors must be considered when determining the relative emphasis,
timing, and importance to place on formative evaluation.

I would now like to turn to the summative aspects of applied
research in curriculum projects. These are activities carried out
for the purpose of providing information to the eventual user of the
course. This information is valuable to the user (teacher,
supervisor, etc.) insofar as it helps him make decisions concerning
adoption and effective use of the course. In my opinion, all projects
have an obligation to describe their programs and to provide evidence
of success in achieving stated or implied objectives.

Examples of several summative evaluation questions studied in
connection with Project Physics include:

1. You state that one of the goals of your course is to
   increase enrollments in high school physics. What
   evidence do you have that this objective is being
   achieved?

2. In order to increase enrollments in physics, you must
   appeal to groups of students who normally do not take
   physics. How do these students perform in your course?

3. What teacher preparation is required to teach this new
   program effectively?

4. How do Project Physics students perform on national
   examinations such as the College Boards?

5. Has the course been trial tested using typical physics
   teachers, rather than using selected physics teachers?

6. What changes in attitude and interest have been
   identified as a result of taking your course?

To illustrate an example of summative evaluation and to point
out its success and limitations, let me describe a study we conducted
to answer the question, "What appeal does the course have to those
students now successfully avoiding any study of physics?" Of special
interest are girls, since 95 out of 100 usually choose not to take
physics. We wanted to determine if they responded more favorably
to Project Physics than to other courses.

During the 1966-67 trial year, a pilot study examined the
relative success of boys and girls. A similar study was conducted again
during 1967-68 using a national random sample of teachers with random
assignment to experimental and control groups. Included in the
56 schools that were using the Project Physics materials for the first
time were 522 girls. In the 21 comparison schools there were 159
girls. In both groups the number of girls is about of the total
physics population. Also, the initial interest in physics and IQ
of the experimental and control groups were approximately the same.

Without going into detail regarding the instruments and method
of analysis, we found that, indeed, girls in Project Physics did seem
to respond more favorably to that course. For example, on a measure
of course satisfaction, they had a mean rating of .60 while the
comparison group rating was .47. As expected, Project Physics girls
gained more on our achievement test, but they also showed greater
gains on several non-project cognitive measures. They saw physics as
more diverse and less difficult than did the comparison group. And
they received from their physics teachers higher course grades,
certainly an important factor in determining course satisfaction.

We examined change in interest in the subject over the year.
We found that both groups of girls showed declines in interest as
measured by several different tests. For example, on a semantic
differential composite interest score, Project Physics girls declined
.5 pre-test standard deviations while the comparison group declined
.6 standard deviations. Loss in interest is certainly not what
course developers have in mind; it is mildly compensating that
Project Physics girls did not decline as much as the comparison group.
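
A decline expressed in pre-test standard deviations is the mean change
divided by the spread of the pre-test scores. The sketch below shows that
arithmetic under stated assumptions: the function name is hypothetical, the
sample standard deviation is assumed (the paper does not say which form was
used), and the scores are invented for illustration, not Project data.

```python
from statistics import mean, stdev


def standardized_change(pre, post):
    """Mean pre-to-post change in units of the pre-test standard deviation.

    A negative value means the group's mean score declined over the year.
    """
    return (mean(post) - mean(pre)) / stdev(pre)


if __name__ == "__main__":
    # Invented interest scores for one small group of students.
    pre_scores = [4, 5, 6, 7, 8]
    post_scores = [3, 5, 5, 6, 7]
    print(round(standardized_change(pre_scores, post_scores), 2))
```

Dividing by the pre-test spread puts instruments with different scales on a
common footing, which is what allows the two groups' declines to be compared.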

I should mention that several other studies have noted similar
declines in interest in school subjects.²,³ Reporting this finding
to prospective Project Physics teachers might be hazardous, but we
are hoping a straightforward presentation of all findings will not
only indicate an honest evaluation program, but also help to increase
our understanding of physics education.

There are limitations in summative evaluation in addition to the
possibility of negative findings. Of course we are limited in our
evaluation by the precision of our instruments. There is some
indication of ceiling effects on the semantic differential test. We
have also encountered problems of attrition: a teacher who was
supposedly using our program was in fact using a traditional textbook,
tests were lost during mailing, on a few occasions teachers did not
give tests, and perhaps most important, there were some serious delays
on the part of our production staff to provide teachers with the
necessary materials in time to use them in the schools. These
practical problems are very much a part of evaluation and impose their
own peculiar limitation, certainly not insurmountable, but always
in evidence to the applied researcher.

It is too early to assess the impact of our summative evaluation
findings. We are writing a book for the professional researcher
describing in detail our procedure and the results of both formative
and summative evaluation studies. A shorter pamphlet written for
teachers, administrators, and guidance counsellors will also be
available to those inquiring about the course. We hope the
information they find there will help them to make rational decisions
regarding adoption and use of the course. Once these documents are
available and tried by teachers, we will be in a better position to
assess the accomplishments and limitations of applied research in our
curriculum project.